Numpy
Numpy (but first lists)
Numbers are the basis of all computation. We rarely operate on single numbers; usually we work on a whole list/array/vector of numbers. This is what numpy is designed for.
One might ask: what's wrong with the built-in list type? Let's start by taking a closer look at lists.
l = [1, 2, 3]
l[1, 2, 3]
type(l), l[0], l[-1], len(l), sum(l)(list, 1, 3, 3, 6)
Elements of a single list may be of different types.
l2 = [1, "a", "0"]
Ways to index the list:
l2[0], l2[:], l2[-1](1, [1, 'a', '0'], '0')
Since the elements of l2 cannot be added together, sum(l2) will raise an error.
sum(l2)TypeError: unsupported operand type(s) for +: 'int' and 'str'
An additional remark: (although we shouldn't rely on it) python allows sum([1, 2.5, 3j]), calculating the sum of a list with different numeric types.
l3 = [1, 2.5, 3j]
for e in l3:
print(f"Element {e} is of type {type(e)}")
sum(l3)Element 1 is of type <class 'int'>
Element 2.5 is of type <class 'float'>
Element 3j is of type <class 'complex'>
(3.5+3j)
Multiplying list by int repeats the list.
[0] * 5[0, 0, 0, 0, 0]
[1, 2, 3] * 4[1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3]
There is a subtlety, however. The elements of the repeated list are shallow copies, i.e. references to the very same objects.
The following example presents this behavior:
ll = [[]] * 3 # list with three empty lists
ll[[], [], []]
ll[0].append(42) # we append 42 to the first list
ll[0][42]
ll # the element has been added to every list! it's actually THE SAME list[[42], [42], [42]]
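If you want three independent lists, the usual fix is a list comprehension, which evaluates [] anew for every element:

```python
ll = [[] for _ in range(3)]  # three distinct empty lists
ll[0].append(42)             # only the first list changes
print(ll)                    # [[42], [], []]
```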
There are interesting threads on Stack Overflow regarding this issue.
print(l)
for e in l:
print(e**2)[1, 2, 3]
1
4
9
for i, e in enumerate(l):
print(i, e**2)0 1
1 4
2 9
List comprehensions are worth getting to know well; they are super useful.
l3 = [i**2 for i in range(10)]
l3[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
Elements from the third (index 2) to the end.
l3[2:][4, 9, 16, 25, 36, 49, 64, 81]
Elements with indices from 2 (inclusive) to 3 (exclusive). Since slicing l[i:j] always returns a list, we receive a single-element list, not just the value l3[2].
l3[2:3][4]
Negative indices exist in python; they count from the end of the list. Here we receive the whole list except the last element.
l3[:-1][0, 1, 4, 9, 16, 25, 36, 49, 64]
It's good to omit redundant indices, like below where we want to take all elements with indices smaller than 4.
l3[0:4] == l3[:4]True
Slicing also allows taking every n-th element and reversing the list.
l3[::2][0, 4, 16, 36, 64]
l3[3:8:2] # From element 3 to 8, every 2 elements. Indices 3, 5, 7[9, 25, 49]
l3[::-1] # All elements, every -1 element - reversing the list[81, 64, 49, 36, 25, 16, 9, 4, 1, 0]
So after this short reminder of lists in python, we can finally move on to numpy.
The basic structure in numpy is np.ndarray, an n-dimensional array. We can think of np.ndarray as a list/array. For \(n=1\) it will be a vector, for \(n = 2\) a matrix.
In general we could call it a tensor, but we will call them arrays, as that's the most popular name.
Every time you have some problem, try googling first, probably someone already had this problem and the solution is available on StackOverflow.
Example google input: “Numpy arrays creation tutorial”.
Let's try to do similar things as with lists.
import numpy as np
x = np.array([1, 2, 3]) # this is how we create an array from a python list
xarray([1, 2, 3])
x[0], x[-1], len(x)(1, 3, 3)
x[1:]array([2, 3])
x[2:]array([3])
x[::-1]array([3, 2, 1])
3 in xTrue
Array has a few useful fields:
x.dtype, x.ndim # data type and number of dimensions(dtype('int64'), 1)
# number of elements in every dimension
# note that it's a single element tuple
x.shape(3,)
Now let’s create multidimensional array using reshape.
lr = list(range(12))
a = np.array(lr)
aarray([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11])
Note that we specify required shape as a tuple.
a2 = a.reshape((3, 4))
a2array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
Calling the reshape method doesn't change a; it returns a new array object (usually a view of the same data).
aarray([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11])
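A caveat worth knowing: although a itself is left untouched, reshape returns a view of the same data whenever possible, so writing through the reshaped array modifies the original buffer:

```python
import numpy as np

b = np.arange(12)
b2 = b.reshape((3, 4))  # b2 is a view onto b's data
b2[0, 0] = 99
print(b[0])             # 99 - b2 shares memory with b
```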
a2[0, :] # the first rowarray([0, 1, 2, 3])
a2[:, 0] # the first columnarray([0, 4, 8])
a2[2, 3] # an element from 3rd row and 4th column11
a2[:, ::2] # all rows and every second columnarray([[ 0, 2],
[ 4, 6],
[ 8, 10]])
a2[:, 1::2] # all rows and every second column starting from the one with index 1array([[ 1, 3],
[ 5, 7],
[ 9, 11]])
In practice we rarely create an np.ndarray from a python list, because we would have to first create the list only to immediately convert it into an np.ndarray.
Here are the typical ways to create an array.
np.zeros((2, 4))array([[0., 0., 0., 0.],
[0., 0., 0., 0.]])
np.ones((2, 3))array([[1., 1., 1.],
[1., 1., 1.]])
np.arange(10) # note that the last number is smaller than 10array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
np.arange(10).reshape((2, 5))array([[0, 1, 2, 3, 4],
[5, 6, 7, 8, 9]])
np.arange(2, 10, 3) # from, to, everyarray([2, 5, 8])
np.linspace(0, 5) # in linspace, the last number is included as wellarray([0. , 0.10204082, 0.20408163, 0.30612245, 0.40816327,
0.51020408, 0.6122449 , 0.71428571, 0.81632653, 0.91836735,
1.02040816, 1.12244898, 1.2244898 , 1.32653061, 1.42857143,
1.53061224, 1.63265306, 1.73469388, 1.83673469, 1.93877551,
2.04081633, 2.14285714, 2.24489796, 2.34693878, 2.44897959,
2.55102041, 2.65306122, 2.75510204, 2.85714286, 2.95918367,
3.06122449, 3.16326531, 3.26530612, 3.36734694, 3.46938776,
3.57142857, 3.67346939, 3.7755102 , 3.87755102, 3.97959184,
4.08163265, 4.18367347, 4.28571429, 4.3877551 , 4.48979592,
4.59183673, 4.69387755, 4.79591837, 4.89795918, 5. ])
np.linspace(0, 5, 5)array([0. , 1.25, 2.5 , 3.75, 5. ])
np.array([1, 2, 3]).repeat(3)array([1, 1, 1, 2, 2, 2, 3, 3, 3])
np.tile(np.array([1, 2, 3]), 3)array([1, 2, 3, 1, 2, 3, 1, 2, 3])
np.random.rand(10) # 10 random numbers from [0, 1)array([0.81627618, 0.08073465, 0.7453185 , 0.78812622, 0.99862186,
0.15712126, 0.69040646, 0.63849645, 0.79207374, 0.19497058])
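A few more constructors that come up constantly; shown here as a quick sketch:

```python
import numpy as np

print(np.full((2, 3), 7))  # array of a given shape filled with one value
print(np.eye(3))           # 3x3 identity matrix
np.empty(4)                # uninitialized memory - contents are garbage until filled
```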
Even though it hasn't been stated explicitly, we could see that every element of an np.ndarray is of the same type. Because of that, every element has a constant size, which allows efficient memory usage. If we really want to, we can make an np.ndarray with the object data type. Then every element is treated as in a python list, holding a reference. This usually means we made an error somewhere along the way, as working with such an array is very inefficient.
Below are a few examples of how to change the data type of an array. By default the dtype is int or float. The dot after a number always means it's a floating point number (not an integer).
(np.zeros(3),
np.zeros(3, dtype=int),
np.zeros(3, dtype=bool),
np.zeros(3, dtype=np.uint16),
np.zeros(3, dtype=complex))(array([0., 0., 0.]),
array([0, 0, 0]),
array([False, False, False]),
array([0, 0, 0], dtype=uint16),
array([0.+0.j, 0.+0.j, 0.+0.j]))
x = np.zeros(3)
x[0] = 12
x[2] = -1
x.astype(np.float16)array([12., 0., -1.], dtype=float16)
Again, the original x hasn't been modified (its dtype is the same).
xarray([12., 0., -1.])
Note that we can create an overflow this way!
x.astype(np.uint8)array([ 12, 0, 255], dtype=uint8)
x = np.array([1e100])
x, x.astype(np.float16)(array([1.e+100]), array([inf], dtype=float16))
np.array(["ala", 2, int])array(['ala', 2, <class 'int'>], dtype=object)
It's worth knowing that np.inf (infinity) above is a float, but a special one. Similarly, np.nan is a special number that is “not a number”. Both are part of the IEEE 754 standard, which describes the way floats should work on every machine, in every programming language.
a = np.inf
a, a*5, a-4, a*0, -a, a+2, a-a(inf, inf, inf, nan, -inf, inf, nan)
-0.0-0.0
0.00.0
type(np.inf)float
type(np.nan)float
Attention!!! nan is not equal to nan:
np.nan == np.nan, np.nan != np.nan, np.nan < np.nan, np.nan >= np.nan(False, True, False, False)
np.isnan(np.r_[np.nan, 1, 2])array([ True, False, False])
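Because a single nan poisons ordinary reductions, numpy provides nan-aware variants (np.nansum, np.nanmean, etc.) that simply skip nans:

```python
import numpy as np

x = np.array([1.0, np.nan, 3.0])
print(x.sum())        # nan - the nan propagates
print(np.nansum(x))   # 4.0 - nans are ignored
print(np.nanmean(x))  # 2.0 - mean over the non-nan elements
```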
In the example below, note that dtype='<U4'. It describes all strings of length at most 4. This means that every item of this array will have at most 4 characters, hence a constant size. In general this is desired behaviour, as it's easier for the computer to work on constant-size elements, but if we had an array of strings of length 2-5 with only one of length 10000 characters, the array size would explode!
vala = np.array(["Ala", "ma", "kota"])
valaarray(['Ala', 'ma', 'kota'], dtype='<U4')
# Now we create two arrays; they are equal on all but the final index
vala1 = np.array(["Ala", "ma", "kota"] * 100)
vala2 = np.array(["Ala", "ma", "kota"] * 99 + ["Ala", "ma", "kotaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"])
np.array_equal(vala1[:-1], vala2[:-1]), vala1[-1] == vala2[-1](True, False)
Size of a single element and of the whole array:
vala1.itemsize, vala1.itemsize * vala1.size(16, 4800)
vala2.itemsize, vala2.itemsize * vala2.size(128, 38400)
38400 / 4800
8.0
The second array takes 8 times more space, despite differing only in the last element.
OK, we can create and index an array, so what? Let's calculate something!
What distinguishes arrays from lists are vectorized operations. This means we can (and should) forget about writing for loops and work on the whole array at once.
x = np.arange(4)
xarray([0, 1, 2, 3])
x * 2array([0, 2, 4, 6])
x ** 2array([0, 1, 4, 9])
np.sin(x)array([0. , 0.84147098, 0.90929743, 0.14112001])
What is important (from the performance point of view), all np.ndarrays live as arrays in C. Writing y = x * 2 roughly means executing the following C code and later returning the result to python.
int* fun(int* x, int n) {
    int *y = new int[n];
    for (int i=0; i<n; i++)
        y[i] = x[i] * 2;
    return y;
}
Only numpy is a bit easier to use :)
y = np.array([4, 1, 2, 1])
x, y(array([0, 1, 2, 3]), array([4, 1, 2, 1]))
Multiplication element by element:
x*y, x/y, x+y, x-y(array([0, 1, 4, 3]),
array([0., 1., 1., 3.]),
array([4, 2, 4, 4]),
array([-4, 0, 0, 2]))
x % 2 # remainder from division by 2array([0, 1, 0, 1])
A similar task:
For vectors x and y, find the maximum value of the squared sine of their element-wise product.
It can be solved like this:
z1 = x * y # multiplication element by element
z2 = np.sin(z1) # sine
z3 = z2 ** 2 # its square
z4 = np.max(z3) # maximum of an array
z40.7080734182735712
But we can also do the following one-liner (which is preferred):
(np.sin(x * y) ** 2).max()0.7080734182735712
Comparison operators are vectorized as well!
x, x > 1(array([0, 1, 2, 3]), array([False, False, True, True]))
x != 1array([ True, False, True, True])
x == 3array([False, False, False, True])
xb1 = np.array([True, False, False, True, True])
xb1array([ True, False, False, True, True])
~xb1array([False, True, True, False, False])
xb2 = np.array([True, False, True, False, True])
# Logical and
xb1 & xb2array([ True, False, False, False, True])
# Logical or
xb1 | xb2array([ True, False, True, True, True])
Pay special attention whenever you mix != / == with & / |! You have to place parentheses () around the != / == expressions.
It's due to operator precedence, similar to how you must add parentheses to make \(2 + 2 \cdot 2\) equal 8: \((2 + 2) \cdot 2 = 8\).
x > 1 & x < 3ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
(x > 1) & (x < 3)array([False, False, True, False])
If x * y multiplies vectors element-by-element, how can we get scalar product, multiply matrices?
We can do it with @ operator!
x, y(array([0, 1, 2, 3]), array([4, 1, 2, 1]))
Scalar product:
Please note that numpy is not really strict when it comes to the horizontality/verticality of vectors, as below. In many programming languages (and in math) \(y\) would have to be a vertical (column) vector for this operation to make sense.
x @ y == (x * y).sum()True
M1 = np.arange(12).reshape((3, 4))
v1 = np.array([0, 1, 2, 3])
If it makes sense, numpy will multiply each row of the matrix by the vector, as long as the shapes are right. Check out broadcasting.
M1 * v1array([[ 0, 1, 4, 9],
[ 0, 5, 12, 21],
[ 0, 9, 20, 33]])
M1 @ v1array([14, 38, 62])
v1 @ M1 # wrong shapesValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 3 is different from 4)
M2 = np.arange(9).reshape((3, 3))
v2 = np.array([0, 1, 2])
v2 @ M2array([15, 18, 21])
Fortunately there's a limit to what numpy can take before it raises an error 🙈
v2.reshape((3, 1)) @ M2ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 3 is different from 1)
The operation below doesn't make any sense in math.
v2 @ M2 @ v260
But numpy still gives an answer ¯\_(ツ)_/¯. Sometimes it may lead to weird and unexpected errors.
Mathematically speaking we should do \(v_2 M_2 v_2^T\).
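To write \(v_2 M_2 v_2^T\) explicitly, we can reshape v2 into proper row and column vectors; the result then comes out as a 1×1 matrix with the same value:

```python
import numpy as np

M2 = np.arange(9).reshape((3, 3))
v2 = np.array([0, 1, 2])

row = v2.reshape((1, 3))   # explicit row vector
col = v2.reshape((3, 1))   # explicit column vector
print(row @ M2 @ col)      # [[60]] - the same number, now as a 1x1 matrix
```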
We often concatenate arrays. Below are a few ways to do it.
x1 = np.array([0, 1, 2])
x2 = np.array([3, 4, 5])
np.hstack([x1, x2])array([0, 1, 2, 3, 4, 5])
np.vstack([x1, x2])array([[0, 1, 2],
[3, 4, 5]])
In machine learning, it’s useful to concatenate arrays (tensors) over a certain dimension. Often when working with images we have a single tensor with multiple images of the same size. Then we can think about 4 dimensions (B, C, H, W):
M1 = np.arange(9).reshape((1, 1, 9, 1))
M2 = np.arange(9, 18).reshape((1, 1, 9, 1))np.concatenate((M1, M2), axis=0).shape(2, 1, 9, 1)
np.concatenate((M1, M2), axis=1).shape(1, 2, 9, 1)
np.concatenate((M1, M2), axis=2).shape(1, 1, 18, 1)
np.concatenate((M1, M2), axis=3).shape(1, 1, 9, 2)
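Related to concatenation: np.stack joins arrays along a brand new axis instead of an existing one, which is handy for building that (B, C, H, W) batch out of single images:

```python
import numpy as np

img1 = np.zeros((3, 9, 9))               # a single (C, H, W) "image"
img2 = np.ones((3, 9, 9))
batch = np.stack([img1, img2], axis=0)   # stack adds a new leading batch axis
print(batch.shape)                       # (2, 3, 9, 9)
```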
As you've seen above, instead of writing np.max(x), I wrote x.max(), taking advantage of python's object-oriented nature. Below are a few of the most commonly used methods.
a2array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
a2.min(), a2.max(), a2.sum()(0, 11, 66)
xr = np.random.rand(10)
xrarray([0.0148735 , 0.8620361 , 0.6282577 , 0.89021664, 0.58155447,
0.300763 , 0.31205726, 0.54115428, 0.02696092, 0.29733988])
xr.argmin(), xr.argmax()(0, 3)
xr.round(2)array([0.01, 0.86, 0.63, 0.89, 0.58, 0.3 , 0.31, 0.54, 0.03, 0.3 ])
Additionally, a lot of these methods have an axis parameter that allows choosing the axis along which to apply them. In the case of matrices, you can choose whether you want to sum the whole matrix, axis=None (the default), along columns, axis=0, or along rows, axis=1.
a2.min(axis=0), a2.max(axis=0), a2.sum(axis=0)(array([0, 1, 2, 3]), array([ 8, 9, 10, 11]), array([12, 15, 18, 21]))
a2.min(axis=1), a2.max(axis=1), a2.sum(axis=1)(array([0, 4, 8]), array([ 3, 7, 11]), array([ 6, 22, 38]))
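A useful companion to axis is the keepdims parameter: it keeps the reduced axis with length 1, so the result broadcasts cleanly against the original array (e.g. to normalize rows):

```python
import numpy as np

a2 = np.arange(12).reshape((3, 4))
row_sums = a2.sum(axis=1, keepdims=True)  # shape (3, 1) instead of (3,)
normalized = a2 / row_sums                # each row divided by its own sum
print(normalized.sum(axis=1))             # [1. 1. 1.]
```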
We will often use boolean arrays. Pay special attention to these two cases:
x = np.random.rand(1000)
y = x > 0.3
y.mean(), y.sum() # fraction and number of elements above 0.3(0.727, 727)
A sharp eye will notice that when dealing with python lists I’ve used the built-in sum, while now I’ve used np.sum or x.sum(). So what’s the difference? It’s massive! Let’s make the following experiment and compare execution time of those sums.
n = 1_000_000
np.random.seed(49951108)
x_sum = np.random.rand(n)
sum(x_sum), np.sum(x_sum), x_sum.sum()(499898.47315079294, 499898.4731508096, 499898.4731508096)
%timeit sum(x_sum)91.6 ms ± 5.52 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
%timeit np.sum(x_sum)724 µs ± 31.1 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
%timeit x_sum.sum()690 µs ± 36.4 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
As we can see, using numpy np.sum is 160 times more efficient. It’s not a coincidence. The built-in sum works on python lists, not utilizing high-performance implementation of x array in C underneath. There is almost no difference between np.sum(x) and x.sum(). You can use whatever you are more comfortable with and better suits the context.
Now let’s see what would happen if we wrote this summing function using python loop.
def loop_sum(x):
c = 0.0
for e in x:
c += e
return c
loop_sum(x_sum)499898.47315079294
%timeit loop_sum(x_sum)104 ms ± 1.24 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
The result is the slowest, but comparable to the built-in sum.
As we see below, the sum calculated in two different ways gives… different results.
Generally speaking, in computer science we talk about either int - integers, or float - floating point numbers. The discussion below applies to floats.
sum(x_sum) - np.sum(x_sum)-1.664739102125168e-08
The notation 3.1415e-3 means the number \(3.1415 \cdot 10^{-3}\), i.e. 0.0031415; this is scientific notation. We can clearly see that the above difference is not equal to 0…
It's as if the result of \(a + b + c\) depended on whether I calculate it as \((a + b) + c\) or \(a + (b + c)\).
a = 0.1
b = 0.2
c = -0.3
(a + b) + c, a + (b + c)(5.551115123125783e-17, 2.7755575615628914e-17)
Well, strange. But it can be even weirder…
a + b + c == a + c + bFalse
What's going on? Variables of float type are represented in memory similarly to how we write \(3.1415 \cdot 10^{-3}\): the computer saves only 31415 and -3. Additionally, the computer doesn't operate in the decimal system like we do, but in binary. This means it saves a float as \(S \cdot 2^E\), along with the number's sign (plus/minus), where \(S\) is the significand and \(E\) the exponent.
Now let's consider something slightly different: how much is \(1/3 \cdot 3\)? We restrict ourselves to not using rational fractions, operating only on decimal expansions (writing the number with a dot). So we have \(0.33333333 \cdot 3\) (assume more threes don't fit on our sheet of paper). In the end we receive the number \(0.99999999 \neq 1\). The computer has an analogous problem, but for it the numbers \(0.1, 0.2, 0.3\) are the problematic ones. All because it operates in binary.
What are the consequences? Never compare two floats with ==!!!
Why? That’s why…
0.2 + 0.1 == 0.3False
So what should I do?
This:
np.isclose(0.2 + 0.1, 0.3), np.isclose(sum(x_sum), np.sum(x_sum))(True, True)
And in case of arrays/matrices:
np.random.seed(1)
x = np.random.rand(1000)
x2 = (x * x) / x
np.array_equal(x, x2), np.allclose(x, x2) # the first is equivalent of x == x2(False, True)
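Both np.isclose and np.allclose accept rtol (relative) and atol (absolute) tolerance parameters, so you can decide how close counts as “close”:

```python
import numpy as np

print(np.isclose(1.0, 1.0 + 1e-9))     # True with the default tolerances
print(np.isclose(1.0, 1.1))            # False - too far apart
print(np.isclose(1.0, 1.1, atol=0.2))  # True once the tolerance is loosened
```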
So we can index arrays like lists, but that's not the only way! We can also index them with other arrays, of ints or bools. It might sound weird, but check out the following examples:
Documentation on np.ndarray indexing
x = np.array([10, 42, 1337, -1])
indexer1 = np.array([False, True, True, False])
indexer2 = np.array([0, 0, 3, 2, 1, 3, 1])
x[indexer1]array([ 42, 1337])
x[indexer2]array([ 10, 10, -1, 1337, 42, -1, 42])
It allows for super cool combos:
M = np.random.rand(20).reshape((5, 4))
Marray([[0.32580997, 0.88982734, 0.75170772, 0.7626321 ],
[0.46947903, 0.2107645 , 0.04147508, 0.3218288 ],
[0.03711266, 0.69385541, 0.67035003, 0.43047178],
[0.76778898, 0.53600849, 0.03985993, 0.13479312],
[0.1934164 , 0.3356638 , 0.05231295, 0.60511678]])
indexer = M > 0.3
indexerarray([[ True, True, True, True],
[ True, False, False, True],
[False, True, True, True],
[ True, True, False, False],
[False, True, False, True]])
M[indexer]array([0.32580997, 0.88982734, 0.75170772, 0.7626321 , 0.46947903,
0.3218288 , 0.69385541, 0.67035003, 0.43047178, 0.76778898,
0.53600849, 0.3356638 , 0.60511678])
np.array_equal(M[M>0.3], M[indexer])True
(Almost) nothing new here. We can either sort an array in place or return a sorted copy.
x = np.random.rand(5)
y = np.sort(x) # copy returned, x not changed
x, y(array([0.51206103, 0.61746101, 0.43235559, 0.84770047, 0.45405906]),
array([0.43235559, 0.45405906, 0.51206103, 0.61746101, 0.84770047]))
Descending sorting?
np.sort(x)[::-1]array([0.84770047, 0.61746101, 0.51206103, 0.45405906, 0.43235559])
z = x.sort() # Sorts x in place and returns None
print(z)None
xarray([0.43235559, 0.45405906, 0.51206103, 0.61746101, 0.84770047])
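When you need the sorting permutation rather than the sorted values (e.g. to reorder a second array the same way), np.argsort returns the indices that would sort the array:

```python
import numpy as np

x = np.array([30, 10, 20])
order = np.argsort(x)   # indices that would sort x
print(order)            # [1 2 0]
print(x[order])         # [10 20 30] - fancy indexing applies the permutation
```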
np.where is a commonly used function, worth memorizing. The second and third arguments might each be either a single element or an array.
x = np.arange(10)
np.where(x > 5, "Large", "Small")array(['Small', 'Small', 'Small', 'Small', 'Small', 'Small', 'Large',
'Large', 'Large', 'Large'], dtype='<U5')
labels = [f"l_{i}" for i in range(10)]
labels['l_0', 'l_1', 'l_2', 'l_3', 'l_4', 'l_5', 'l_6', 'l_7', 'l_8', 'l_9']
np.where(x > 5, labels, "Small")array(['Small', 'Small', 'Small', 'Small', 'Small', 'Small', 'l_6', 'l_7',
'l_8', 'l_9'], dtype='<U5')
np.where(x > 5, np.where(x>7, "Extra large", "Large"), "Small")array(['Small', 'Small', 'Small', 'Small', 'Small', 'Small', 'Large',
'Large', 'Extra large', 'Extra large'], dtype='<U11')
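When nested np.where calls get hard to read, np.select takes a list of conditions and a list of choices; conditions are checked in order and the first match wins:

```python
import numpy as np

x = np.arange(10)
# Equivalent to the nested np.where above, but flat and readable
labels = np.select([x > 7, x > 5], ["Extra large", "Large"], default="Small")
print(labels)
```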
I assume you're not here only to understand the basics; you're interested in advanced stuff as well! Now I'll present some subtleties and tricks that will allow you to become a pro numpy user.
np.r_[] - I don't want to memorize np.array, np.arange, np.linspace and others. When you write a lot of numpy, you try to find as many shortcuts as possible. Some of those shortcuts are np.r_[] and np.c_[] (from row and column).
In the most basic form it can replace np.array.
np.r_[0, 1, 2]array([0, 1, 2])
But also np.arange.
np.r_[:3], np.r_[3:7:2](array([0, 1, 2]), array([3, 5]))
And np.linspace! Note that, technically speaking, the step is written as n imaginary units: nj means “produce n points”, with the last number included.
np.r_[0:5:4j], np.r_[0:5:11j](array([0. , 1.66666667, 3.33333333, 5. ]),
array([0. , 0.5, 1. , 1.5, 2. , 2.5, 3. , 3.5, 4. , 4.5, 5. ]))
That's not all - you can also create matrices.
np.c_[[0, 1], [2, 3]]array([[0, 2],
[1, 3]])
Or substitute np.hstack
x = np.r_[:3]
np.r_[x, x], np.c_[x, x](array([0, 1, 2, 0, 1, 2]),
array([[0, 0],
[1, 1],
[2, 2]]))
np.r_[:3, :5, :3]array([0, 1, 2, 0, 1, 2, 3, 4, 0, 1, 2])
Sometimes when reshaping you don't want to calculate the last dimension, as it's obvious: it is the only fitting number. There's a trick for that!
# automatically calculated 111//3 == 37 and substituted for -1
np.arange(111).reshape((3, -1)) array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12,
13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25,
26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36],
[ 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49,
50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62,
63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73],
[ 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86,
87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99,
100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110]])
In many physical, chemical, or economic simulations we cannot get around writing loops. As we already stated, loops are very slow in python.
Thankfully, there's a solution for this as well: numba.
Let's calculate the sum we computed some time ago. We will briefly discuss what numba does later on.
from numba import jit

@jit
def loop_sum2(x):
c = 0.0
for e in x:
c += e
return c
loop_sum2(x_sum)499898.47315079294
As a reminder, the times of the basic loop_sum function and the numpy implementation:
%timeit loop_sum(x_sum)110 ms ± 2.06 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
%timeit x_sum.sum()707 µs ± 24.9 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
%timeit loop_sum2(x_sum)1.17 ms ± 57 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
As we can see, our solution is slower than numpy, but only by a factor of two, not 160!!!
The obvious question is: “How is this possible?”
numba tries to compile your python code to machine code and makes some optimizations along the way. It is designed to work with C-friendly features. You cannot use python lists inside a numba-jited (jit - just-in-time compilation) function (you can, but the performance gain will be small), but you can use np.ndarray. If you want to return multiple values from a jited function, use a tuple instead of a list. The perfect code for numba jiting is one with lots of loops and ifs, plus some numpy arrays.
Let's check out the next simulation.
A popular case in which you cannot vectorize a particular operation is stochastic differential equation solving.
Exercise:
You have $100 at the beginning. You diversify risk and follow two strategies at once. Each day you invest: 1. 10% of your money into an investment that, with a 10% success rate, pays back 10 times the invested amount (otherwise you lose it). 2. 50% of your money into an investment that pays back the invested amount plus 2% interest.
Make 10 000 simulations and count in how many cases, and after how many days, you reach $10 000, and in how many cases you fall below $10.
def simulation1(n, seed = 1):
np.random.seed(seed)
days = np.zeros(n)
wins = np.zeros(n, dtype=bool)
for i in range(n):
d = 1
M = 100
while True:
inv1 = 0.1 * M
inv2 = 0.5 * M
if np.random.rand() < 0.1:
M += inv1 * 9
else:
M -= inv1
M += inv2 * 0.02
if M > 10_000:
days[i] = d
wins[i] = True
break
if M < 10:
days[i] = d
break
d += 1
return days, wins
days, wins = simulation1(100_000)
%timeit simulation1(100_000)5.35 s ± 269 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
# Wins
wins.sum(), days[wins].min(), days[wins].mean(), np.median(days[wins]), days[wins].max()(2659, 12.0, 167.72658894321174, 139.0, 1255.0)
# Losses :(
(~wins).sum(), days[~wins].min(), days[~wins].mean(), np.median(days[~wins]), days[~wins].max()(97341, 25.0, 105.40484482386661, 72.0, 1298.0)
import matplotlib.pyplot as plt

# Prettier plot in one of the following lessons ^^
plt.hist(days[wins], bins=50)
plt.xlabel("Days")
plt.title("Wins")Text(0.5, 1.0, 'Wins')
plt.hist(days[~wins], bins=50)
plt.title("Losses")
plt.xlabel("Days")Text(0.5, 0, 'Days')
Executing this code took just a few seconds, but what if we need longer simulations? How to use numba here?
@jit
def simulation2(n, seed = 1):
np.random.seed(seed)
days = np.zeros(n)
wins = np.zeros(n, dtype=np.bool_) # We must introduce a fix here to make it work
for i in range(n):
d = 1
M = 100
while True:
inv1 = 0.1 * M
inv2 = 0.5 * M
if np.random.rand() < 0.1:
M += inv1 * 9
else:
M -= inv1
M += inv2 * 0.02
if M > 10_000.0:
days[i] = d
wins[i] = True
break
if M < 10.0:
days[i] = d
break
d += 1
return days, wins
_ = simulation2(100_000)
%timeit simulation2(100_000)57.6 ms ± 433 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
Almost 100 times faster… Potentially we can shorten a day-long experiment to 15 minutes - that's a lot!
Can we do better? Yes, we can: you can utilize parallelism and the GPU. Read the documentation if you're interested and go through the provided examples.
Obviously this is just a fraction of numpy's capabilities. You should also check:
np.linalg - matrix decompositions, eigenvalues etc.
np.fft - fast Fourier transform
np.random - different distributions and random number generation
np.polynomial - working with polynomials
np.histogram - a histogram that returns just the numbers
np.einsum - super smart calculations on tensors